Abstract
Introduction
Multiple myeloma has a complex transcriptomic landscape, with gene mutations, fusions, and spliced transcript isoforms all contributing to gene expression. Short-read multiomics studies have offered insights but lack the resolution to fully capture this complexity. Long-read RNA sequencing, which can cover full RNA fragments and so allows more accurate isoform mapping, could offer improved resolution into the transcriptomic landscape of myeloma. We have generated the first long-read RNA sequencing cohort of newly diagnosed multiple myeloma (NDMM) patients from the UKMRA RADAR trial, with the aim of assessing how well it captures expressed variants alongside gene expression in myeloma.
Methods
The UKMRA RADAR trial is a prospective, national, multi-centre, risk-adapted, response-guided, multi-arm, multi-stage (MAMS) phase II/III trial in NDMM patients eligible for ASCT (ISRCTN46841867). We optimised both experimental and computational protocols for Oxford Nanopore Technologies (ONT) long-read RNA sequencing. High-quality RNA was obtained from CD138+ selected diagnostic bone marrow cells. Primer optimisation using native ONT indexes and in-house homotrimer UMIs minimised concatenation and PCR bias. The resulting cDNA libraries were multiplexed and sequenced on the PromethION platform. We developed a novel quality control tool, 'Splitfastqcats', to ensure full-length read filtering (mean length = 1,085 bp), followed by pre-processing and alignment using our in-house TallyTriN pipeline (mean mapped reads = 6.94 × 10^6 per sample). Previously published long-read tools were compared for consensus: gene expression was called with Salmon; isoforms and splicing variants with Bambu and FLAIR; mutations with Clair3-RNA; and fusions with a consensus between JAFFAL, CTAT-LR, genion and FusionSeeker. Stringent filtering was based on variant allele frequency (VAF), read depth, number of supporting reads, and high-confidence calls from each tool's own filter. Results were compared to targeted-region DNA sequencing using the Myeloma Genome Panel (MGP) (PMID:35522533).
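The consensus and filtering steps described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the `FusionCall` record, the function names, and every threshold (`min_tools`, `min_reads`, `min_vaf`, `min_depth`, `min_alt_reads`) are hypothetical assumptions chosen for illustration, since the abstract does not state the cut-offs used.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionCall:
    """One fusion reported by one caller (hypothetical record layout)."""
    tool: str              # caller name, e.g. "JAFFAL"
    gene_a: str
    gene_b: str
    supporting_reads: int


def normalise_pair(gene_a, gene_b):
    """Order-independent key so NSD2-IGH and IGH-NSD2 count as one fusion."""
    return tuple(sorted((gene_a.upper(), gene_b.upper())))


def consensus_fusions(calls, min_tools=2, min_reads=3):
    """Keep fusions reported by at least `min_tools` distinct callers.

    Only calls with at least `min_reads` supporting reads are counted.
    Both thresholds are illustrative assumptions, not the study's values.
    """
    tools_by_pair = defaultdict(set)
    for call in calls:
        if call.supporting_reads >= min_reads:
            key = normalise_pair(call.gene_a, call.gene_b)
            tools_by_pair[key].add(call.tool)
    return {
        pair: sorted(tools)
        for pair, tools in tools_by_pair.items()
        if len(tools) >= min_tools
    }


def passes_variant_filter(vaf, depth, alt_reads,
                          min_vaf=0.05, min_depth=10, min_alt_reads=3):
    """Apply VAF, read-depth and supporting-read cut-offs (assumed values)."""
    return vaf >= min_vaf and depth >= min_depth and alt_reads >= min_alt_reads


if __name__ == "__main__":
    example_calls = [
        FusionCall("JAFFAL", "NSD2", "IGH", 12),
        FusionCall("CTAT-LR", "IGH", "NSD2", 9),
        FusionCall("genion", "MAF", "IGH", 2),
        FusionCall("FusionSeeker", "MAF", "IGH", 5),
    ]
    print(consensus_fusions(example_calls))
    print(passes_variant_filter(vaf=0.30, depth=40, alt_reads=12))
```

In this toy input the NSD2-IGH fusion is retained because two callers support it, whereas MAF-IGH is dropped because only one call meets the assumed read threshold. Requiring agreement between callers trades some sensitivity for specificity, which is the usual motivation for a consensus approach.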
Results
Overall, baseline trial-entry samples from 56 NDMM patients were analysed, comprising 75% standard-risk and 25% high-risk patients. The cohort was representative of the general myeloma population, with 12%, 14%, and 5% bearing a t(4;14), t(11;14) and t(14;16) by FISH, respectively.
We were able to identify 100% of t(4;14) patients (n=7) by expression of the NSD2-IGH fusion transcript, as well as a subset of t(11;14) and t(14;16) patients by MYEOV-IGH/KMT5B-IGH (n=2, 25%) and MAF-IGH (n=1, 33%) fusions respectively. We defined a long-read Translocation-Cyclin (TC) classification that captured all IG translocations via overexpressed partner genes. In addition to differential transcript expression and isoform mapping, we also derived the location of the breakpoint of expressed IGH-fused transcripts, especially relevant as t(4;14) breakpoint clusters have been associated with differential risk.
We were able to identify expressed known driver mutations described in the literature and concordant with MGP calls, including non-synonymous KRAS (n=17, 30%), NRAS (n=5, 9%) and DIS3 (n=4, 7%) mutations. Final comparison to the MGP will be presented at the conference. RNA editing sites in common driver UTRs and intronic regions were also highlighted, matching consensus sites in the REDIportal database, with the most edited genes including CHEK1 (n=6, 11%), IKZF3 (n=10, 18%), and PSMB2 (n=21, 37%).
Conclusion
Long-read RNA sequencing enables derivation of cytogenetic variant subgroups and mutation identification alongside reported gene expression-signature based prognostication, and reveals new granularity around driver variant expression, transcript splicing and RNA editing. This is the first study that assesses a large cohort of myeloma patient transcriptomes using long-read RNA sequencing, utilising an in-house designed sequencing pipeline and novel computational algorithms to provide clinically relevant information using one platform. This could allow rapid profiling of patients with minimal sample input and lower cost than current approaches. We continue to add layers of analyses to our dataset, including gene expression profiling and aberrant splicing signatures.